Improve failure mode, add multiple DCs #1273

Open
wants to merge 4 commits into
base: main

as51340
Contributor

@as51340 as51340 commented Apr 29, 2025

Release note

Documented the types of failures tolerated by our current high-availability cluster model. Documented a possible architecture for deployments that span multiple data centers.

Related product PRs

Checklist:

  • Add appropriate milestone (current release cycle)
  • Add bugfix or feature label, based on the product PR type you're documenting
  • Make sure all relevant tech details are documented
  • Check all content with Grammarly
  • Perform a self-review of my code
  • The build passes locally
  • My changes generate no new warnings or errors

@as51340 as51340 self-assigned this Apr 29, 2025

vercel bot commented Apr 29, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
documentation ✅ Ready (Inspect) Visit Preview 💬 Add feedback May 21, 2025 10:01am

Comment on lines 632 to 633
The architecture we currently use allows us to deploy coordinators in 3 data centers and hence tolerate a failure of the whole data center. Data instances can be freely
distributed in any way you want between data centers.
Member

I would extend this with some notes on the expected system requirements, e.g., the latency should be under N ms 🤔

Contributor Author

I don't think that's necessary. Failover will be slower, but IMO a slower network still doesn't disqualify the architecture.

@as51340 as51340 added the "priority: low (improvements)" and "status: ready" labels May 2, 2025
@as51340 as51340 added this to the Memgraph 3.3 milestone May 2, 2025
Contributor

@antejavor antejavor left a comment

Just a small typo + rewording.

Comment on lines +630 to +634
## Data center failure

The architecture we currently use allows us to deploy coordinators in 3 data centers and hence tolerate a failure of the whole data center. Data instances can be freely
distributed in any way you want between data centers. The failover time will be slighlty increased due to the network communication needed.
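
For reviewers who want to sanity-check the claim in this hunk, here is a minimal sketch of the arithmetic behind it. It assumes the coordinators need a strict majority of the configured cluster to stay operational (as in Raft-style consensus); the function names and the `dc1`/`dc2`/`dc3` labels are hypothetical and only illustrate the reasoning, they are not part of any Memgraph API.

```python
from collections import Counter


def majority(n: int) -> int:
    """Smallest number of coordinators that forms a strict majority of n."""
    return n // 2 + 1


def tolerates_any_dc_failure(placement: list[str]) -> bool:
    """Check whether losing any single data center keeps a coordinator majority.

    placement[i] is the (hypothetical) data center label hosting coordinator i.
    Assumption: the cluster stays available only while a strict majority of
    the originally configured coordinators is reachable.
    """
    total = len(placement)
    if total == 0:
        return False
    per_dc = Counter(placement)
    # Losing a data center removes every coordinator hosted there; the
    # survivors must still form a majority of the original cluster size.
    return all(total - lost >= majority(total) for lost in per_dc.values())


# One coordinator per data center: any single data center can fail.
print(tolerates_any_dc_failure(["dc1", "dc2", "dc3"]))
# Two coordinators share dc1: losing dc1 leaves 1 of 3, which is no majority.
print(tolerates_any_dc_failure(["dc1", "dc1", "dc2"]))
```

Under that assumption, three coordinators only survive a whole-data-center outage when each one sits in a different data center, which is why the text calls for 3 data centers. Data instances do not take part in this count, so their placement is unconstrained by it.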

Contributor

Suggested change
- ## Data center failure
- The architecture we currently use allows us to deploy coordinators in 3 data centers and hence tolerate a failure of the whole data center. Data instances can be freely
- distributed in any way you want between data centers. The failover time will be slighlty increased due to the network communication needed.
+ ## Data center failure
+ The architecture we currently use allows us to deploy coordinators in 3 data centers and hence tolerate a failure of the whole data center. Data instances can be freely
+ distributed in any way you want between data centers. The failover time will be slightly increased due to the need for network communication.

Contributor

@antejavor antejavor left a comment

This also points to main branch.

@as51340
Contributor Author

as51340 commented May 20, 2025

> This also points to main branch.

Feel free to merge the suggestion. The PR should go into main because it isn't tied to anything specific to Memgraph 3.3.

@as51340 as51340 marked this pull request as ready for review May 20, 2025 13:48
@as51340 as51340 requested a review from katarinasupe as a code owner May 20, 2025 13:48
@antejavor
Contributor

Cool @as51340, this is part of milestone 3.3, hence the comment.

Labels
priority: low (improvements), status: ready
3 participants